This report evaluates experimental-target forecasts of confirmed Influenza Hospitalization Admissions submitted to the CDC FluSight Forecast Hub for the 2021-2022 and 2022-2023 seasons. Experimental-target submissions began on December 12, 2022. The experimental target was introduced to give forecasting teams an opportunity to forecast increasing and decreasing activity. The experimental target, named “2 wk flu hosp rate change”, is submitted as estimates of the probability of occurrence of each rate-change category. Guidance for submitting the experimental target is shown here.
In addition to the models submitted to the CDC FluSight hospitalization data-experimental GitHub repository for the 2022-2023 season, models from the CDC FluSight hospitalization data-forecasts repository that did not submit experimental forecasts were added to this evaluation for both the 2021-2022 and 2022-2023 seasons. We used a version of this code to convert those submissions into experimental-target forecasts.
This report evaluates experimental forecasts at the state and national levels for confirmed Influenza Hospitalization Admissions for the 2021-2022 and 2022-2023 seasons. Data from HealthData.gov are used as the ground truth for evaluating the forecasts.
We evaluate models based on the Brier score. To account for variation in the difficulty of forecasting different weeks and locations, a pairwise approach was used to calculate an adjusted relative Brier score. Models with relative scores lower than 1 were, on average, more accurate than the baseline, whereas relative scores greater than 1 indicate lower average accuracy than the baseline. The Flusight-baseline model is used as the baseline in the pairwise comparison.
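For a categorical target such as “2 wk flu hosp rate change”, the Brier score sums the squared differences between the forecast probabilities and the indicator for the observed category, and the pairwise approach compares each model's mean score against every other model on the forecasts both submitted. The sketch below illustrates both computations under stated assumptions; the function names and the dictionary layout are illustrative, not the hub's actual evaluation code.

```python
import numpy as np

def brier_score(probs, observed_category):
    """Multi-category Brier score: sum over rate-change categories of
    (forecast probability - observed indicator)^2. Lower is better."""
    probs = np.asarray(probs, dtype=float)
    outcome = np.zeros_like(probs)
    outcome[observed_category] = 1.0
    return float(np.sum((probs - outcome) ** 2))

def pairwise_relative_brier(scores, baseline="Flusight-baseline"):
    """scores maps model -> {forecast unit -> Brier score}, where a
    forecast unit is a (location, target, forecast date) combination.
    Each model's skill theta is the geometric mean, over all other
    models, of the ratio of mean scores on the shared forecast units;
    the relative Brier score is theta_model / theta_baseline."""
    models = list(scores)
    theta = {}
    for m in models:
        ratios = []
        for other in models:
            if other == m:
                continue
            shared = scores[m].keys() & scores[other].keys()
            if not shared:
                continue
            mean_m = np.mean([scores[m][u] for u in shared])
            mean_other = np.mean([scores[other][u] for u in shared])
            ratios.append(mean_m / mean_other)
        if ratios:
            theta[m] = float(np.exp(np.mean(np.log(ratios))))
    return {m: t / theta[baseline] for m, t in theta.items()}
```

By construction the baseline's relative score is 1, and a perfectly confident, correct categorical forecast scores 0.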
The table evaluates forecast models based on the relative Brier score, aggregated across weeks and locations.
Inclusion criteria for each column are detailed below the table.
The models included have submitted at least 50% of forecasts during this time, where one forecast is a location, target, and forecast date combination. The table is initially ordered by each model's relative Brier score aggregated across time and location, with the most accurate models at the top.
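The 50% inclusion rule above amounts to counting each model's submitted (location, target, forecast date) combinations against the number of combinations that could have been forecast. A minimal sketch, assuming a toy submission table whose model names and values are hypothetical:

```python
import pandas as pd

# Hypothetical submission log: one row per submitted forecast, where a
# forecast is a (location, target, forecast date) combination.
submissions = pd.DataFrame({
    "model": ["ModelA", "ModelA", "ModelA", "ModelB"],
    "location": ["US", "06", "48", "US"],
    "target": ["2 wk flu hosp rate change"] * 4,
    "forecast_date": ["2022-12-12"] * 4,
})

# Every distinct (location, target, forecast date) combination that
# could have been forecast.
expected = (submissions[["location", "target", "forecast_date"]]
            .drop_duplicates()
            .shape[0])

# Keep only models that submitted at least 50% of the expected forecasts.
counts = submissions.groupby("model").size()
included = sorted(counts[counts >= 0.5 * expected].index)
```

Here ModelA submitted all three expected forecasts and is kept, while ModelB submitted one of three and is excluded.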
In the following figures, we evaluate models across multiple forecast weeks. Points are shown for all models that submitted experimental-target forecasts. Models shown in the legend with a dot and line have scores for every week; models with only a line are missing scores for at least one week.
The Brier score is used as the metric. The figure shows the mean Brier score across all locations for each submission week.
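The per-week quantity plotted in the figure is a straightforward group-and-average over locations. A small illustrative sketch, with a made-up score table:

```python
import pandas as pd

# Hypothetical per-forecast Brier scores: one row per model, location,
# and submission week (values are invented for illustration).
scores = pd.DataFrame({
    "model": ["ModelA"] * 4,
    "location": ["US", "06", "US", "06"],
    "week": ["2022-12-12", "2022-12-12", "2022-12-19", "2022-12-19"],
    "brier": [0.10, 0.30, 0.20, 0.40],
})

# Mean Brier score across all locations for each submission week,
# the quantity plotted per model in the figure.
weekly_mean = (scores.groupby(["model", "week"], as_index=False)["brier"]
               .mean())
```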
The models included have submitted at least 50% of forecasts during this time, where one forecast is a location, target, forecast date combination.
This figure shows the number of confirmed Influenza Hospitalization Admissions reported each week in the US for the 2022-2023 season.